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ABSTRACT 

Iterative reweighted algorithms, as a class of algorithms for sparse 
signal recovery, have been found to have better performance than 
their non-reweighted counterparts. However, for solving the prob- 
lem of multiple measurement vectors (MMVs), all the existing 
reweighted algorithms do not account for temporal correlation 
among source vectors and thus their performance degrades sig- 
nificantly in the presence of correlation. In this work we propose an 
iterative reweighted sparse Bayesian learning (SBL) algorithm ex- 
ploiting the temporal correlation, and motivated by it, we propose a 
strategy to improve existing reweighted £2 algorithms for the MMV 
problem, i.e. replacing their row norms with Mahalanobis distance 
measure. Simulations show that the proposed reweighted SBL al- 
gorithm has superior performance, and the proposed improvement 
strategy is effective for existing reweighted £2 algorithms. 

Index Terms — Sparse Signal Recovery, Compressive Sensing, 
Iterative Reweighted £2 Algorithms, Multiple Measurement Vectors, 
Sparse Bayesian Learning, Mahalanobis distance 

1. INTRODUCTION 

The multiple measurement vector (MMV) model for sparse signal 
recovery is given by IT) 

Y = #X + V, (1) 

where <I> £ WL NxM (N <C M) is the dictionary matrix whose any 
N columns are linearly independent, Y E R JVxi is the measure- 
ment matrix consisting of L measurement vectors, X £ M. MxL 
is the source matrix with each row representing a possible source, 
and V is the white Gaussian noise matrix with each entry satisfy- 
ing Vij ~ 7V(0, A). The key assumption under the MMV model is 
that the support (i.e. locations of nonzero entries) of every column 
vector X.i (Vi) [] is identical (referred as the common sparsity as- 
sumption in the literature (TJ). The MMV problem is often encoun- 
tered in practical applications, such as neuroelectromagnetic source 
localization and direction-of-arrival estimation. 

Most algorithms for the MMV problem can be roughly divided 
into greedy methods, methods based on mixed norm optimization, 
iterative reweighted methods, and Bayesian methods. 

Iterative reweighted methods have received attention because of 
their improved performance compared to their non-reweighted coun- 
terparts [2. 3 1. In [3], an iterative reweighted £\ minimization frame- 
work is employed. The framework can be directly used for the MMV 
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'The i-th column of X is denoted by X.^. The i-th row of X is denoted 
by X;. (also called the i-th source). 



problem and many MMV algorithms based on mixed norm optimiza- 
tion can be improved via the framework. On the other hand, iterative 
reweighted £2 algorithms were also proposed [2, 4|. The reweighted 
£2 minimization framework for the MMV problem (in noisy case) 
computes the solution at the (k + l)-th iteration as follows Q 

X (fc+1) = argminllY-^Xll^ + AV^dlX^H,)^) 

X * — * 

i 

= W (fc) * T (AI + *W (fe) $ T ) _1 Y (3) 
where typically q = 2, W' fe ' is a diagonal weighting matrix at the 

(k) (k) 

fc-fh iteration with i-th diagonal element being l/w\ , and w\ 
depends on the previous estimate of X. Recently, Wipf et al (2l 
unified most existing iterative reweighted algorithms as belonging to 
the family of separable reweighted algorithms, whose weighting Wi 
of a given row X;. at each iteration is only a function of that indi- 
vidual row from the previous iteration. Further, they proposed non- 
separable reweighted algorithms via variational approaches, which 
outperform many existing separable reweighted algorithms. 

In our previous work [5 6 | we showed that temporal correla- 
tion in sources X^. seriously deteriorates recovery performance of 
existing algorithms and proposed a block sparse Bayesian learning 
(bSBL) framework, in which we incorporated temporal correlation 
and derived effective algorithms. These algorithms operate in the 
hyperparameter space, not in the source space as most sparse sig- 
nal recovery algorithms do. Therefore, it is not clear what the con- 
nection of the bSBL framework is to other sparse signal recovery 
frameworks, such as the reweighted £2 in {2^. In this work, based 
on the cost function in the bSBL framework, we derive an itera- 
tive reweighted £2 SBL algorithm with superior performance, which 
directly operates in the source space. Furthermore, motivated by 
the intuition gained from the algorithm and analytical insights, we 
propose a strategy to modify existing reweighted £2 algorithms to 
incorporate temporal correlation of sources, and use two typical al- 
gorithms as illustrations. The strategy is shown to be effective. 

2. THE BLOCK SPARSE BAYESIAN LEARNING 
FRAMEWORK 

The block sparse Bayesian learning (bSBL) framework [5. 6 1 trans- 
forms the MMV problem to a single measurement vector problem. 
This makes the modeling of temporal correlation much easier. First, 
we assume the rows of X are mutually independent, and the density 
of each row Xi. is multivariate Gaussian, given by 

p(Xi.; 7i, B<) ~jV(0,7iBi), i = l,---,M 

2 For convenience, we omit the superscript, k, on the right hand side of 
learning rules in the following. 



where j, is a nonnegative hyperparameter controlling the row spar- 
sity of X as in the basic sparse Bayesian learning J7] [8). When 
ji — 0, the associated i-th row of X becomes zero. Bi is an un- 
known positive definite correlation matrix. 



By letting y = vec(Y ) 6 



II 



=(X J 



and v = vec(V ), where vec(A) denotes the 



vectorization of the matrix A formed by stacking its columns into a 
single column vector, we can transform the MMV model ([T} to the 
block single vector model as follows 



rameters. To overcome the overfitting, we simplify and consider us- 
ing one matrix B to model all the source covariance matrixes. Thus 
So = r ® B with r = diag([7i, • • ■ , 7a/]). Simulations will show 
that this simplification leads to good results even if different sources 
have different temporal correlations (see Section|5]. 

In order to transform the cost function ((5} to the source space, we 



use the identity: y (AI + DS D 



min x 



y-Dx|| 



x T S x x] , by which we can upper-bound the cost function ((5} and 
obtain the bound 



y = Dx + v. (4) 

To elaborate on the block sparsity model (|4j, we rewrite it as y = 
[*i <8> I L ,--- ,*m ® Ii][xf,--- ,x^] T + v = E l =l(*i ® 
Ii)x, + v, where <&i is the i-th column of <fr, x* G R Lxl is the 
i-th block in x and it is the transposed i-th row of X in the original 
MMV model Q), i.e. x* = X.J.. K nonzero rows in X means K 
nonzero blocks in x. Thus we refer to x as block-sparse. 

For the block model ((4), the Gaussian likelihood is p(y|x; A) ~ 
A/^i x (Dx, AI). The prior for x is given by p(x;7i, Bj, V£) ~ 
A4(0, So), where So is a block diagonal matrix with the i- 
th diagonal block 7iBi (Vi). Given the hyperparameters = 
{A, ji, Bi, Vi}, the Maximum A Posterior (MAP) estimate of x can 
be directly obtained from the posterior of the model. To estimate 
these hyperparameters, we can use the Type-II maximum likelihood 
method [ 8 1, which marginalizes over x and then performs maximum 
likelihood estimation, leading to the cost function: 

£(0) 4 -2 log /"p(y|x;A)p(x; 7l! B l! Vi)dx 



= log |AI + DS D T | + y T (AI + DS D T )- 1 y,(5) 

where7 = [71, ■ ■ ■ ,7a/] t . We refer to the whole framework includ- 
ing the solution estimation of x and the hyperparameter estimation 
as the bSBL framework. Note that in contrast to the original SBL 
framework, the bSBL framework models the temporal correlation 
structure of sources in the prior density via the matrix B^ (Vi). 

3. ITERATIVE RE WEIGHTED SPARSE BAYESIAN 
LEARNING ALGORITHMS 

Based on the cost function l|5}, we can derive efficient algorithms that 
exploit temporal correlation of sources [5. 6|. But these algorithms 
directly operate in the hyperparameter space (i.e. the 7 -space). So, 
it is not clear what their connection is to other sparse signal recov- 
ery algorithms that directly operate in the source space (i.e. the X- 
space) by minimizing penalties on the sparsity of X. Particularly, it 
is interesting to see if we can transplant the benefits gained from the 
bSBL framework to other sparse signal recovery frameworks such as 
the iterative reweighted £2 minimization framework ((2}, improving 
algorithms belonging to those frameworks. Following the approach 
developed by Wipf et al 1 2 1 for the single measurement vector prob- 
lem, in the following we use the duality theory 1 9 1 to obtain a penalty 
in the source space, based on which we derive an iterative reweighted 
algorithm for the MMV problem. 

3.1. Algorithms 

First, we find that assigning a different covariance matrix B^ to each 
source Xj. will result in overfitting in the learning of the hyperpa- 



3 We denote the L X L identity matrix by 1^. When the dimension is 
evident from the context, for simplicity we use I. ® is the Kronecker product. 



£(x, 7 ,B) =log|AI + DS D J 



- x \\y- 



-Dx||i+x T So 1 x. 



By first minimizing over 7 and B and then minimizing over x, we 
can get the cost function in the source space: 

x = argmin ||y - Dx||2 + A5tc(x), (6) 

X 

where the penalty ijtc( x ) is defined by 

o TC (x)= min x t Sq *x + log IAI + DS D T |. (7) 

7^0, B^O 

From the definition (|7) we have 

Stc(x) < x T So 1 x + log|AI + DS D T | 

= x T So 1 x + log|S | +log|iD T D + S,7 1 | + ATI, log A 
< x T So x x + log I So I + z T j- 1 - ,f (z) + TVLlog A 



where in the last inequality we have used the conjugate relation 
1 



logl^D + So 1 



T -1 

min z 7 

z>-0 



(8) 
T , and 



Here we denote 7 -1 = [7^, • ■ • , 7a "/] t , z = [z%, ■ ■ ■ , z M 
/*(z) is concave conjugate of /(7" 1 ) — log |^D T D + Sq~ |. Fi 
nally, reminding of So = T ® B, we have 



Stc(x) < TVLlogA - /*(z) + A/ log |B| + 



E 



xfB _1 Xj + . 



L log ji 



(9) 



Therefore, to solve the problem l(6) with (|9), we can perform the 
coordinate descent method over x, B, z and 7, i.e, 



mm 

x.B,z^0,7^0 



Dxll 



x/B 



x. 



+Llog 7i +Mlog|B|-r(z) 



(10) 



Compared to the framework l(2), we can see 1 /ji can be seen as the 
weighting for the corresponding xf B _1 Xi. But instead of applying 
l q norm on Xj (i.e. the i-th row of X) as done in existing iterative 
reweighted I2 algorithms, our algorithm computes xf B _1 Xi, i.e. 
the quadratic Mahalanobis distance of x» and its mean vector 0. 
By minimizing d 10b over x, the updating rule for x is given by 



,(*+!) 



S D T (AI + DS D T )- 1 y- 



(11) 



According to the dual property [9 1, from the relation (JS], the optimal 
z is directly given by 

dlogllD^D + S^ 1 ! 



Lji - 7 ?Tr[BDr(AI + DS D T ) 'd.J , V* (12) 



where D; consists of columns of D from the ((i — 1)L + l)-th to 
the (iL)-th. From dlOt the optimal 7; for fixed x, z, B is given by 
ji — i[xf B~ 1 Xi + Zf\. Substituting Eq.dl2t into it, we have 



-^-Tr[BDf (AI + DE D T ) 1 D l ],V ? (13) 
By minimizing j 10b over B, the updating rule for B is given by 

M T 

B (fc+U = B/|[B||jr, with B = V^^. (14) 

The updating rules ( lilt d 1 3 1 > and d 14b are our reweighted algo- 
rithm minimizing the penalty based on quadratic Mahalanobis dis- 
tance of Xj. Since for a given i, the weighting 1 /ji depends on the 
whole estimated source matrix in the previous iteration (via B and 
So), the algorithm is a nonseparable reweighted algorithm. 

The complexity of this algorithm is high because it learns the 
parameters in a higher dimensional space than the original problem 
space. For example, consider the bSBL framework, in which the 
dictionary matrix D is of the size NL x ML, while in the original 
MMV model the dictionary matrix is of the size N x M. We use 
an approximation to simplify the algorithm and develop an efficient 
variant. Using the approximation: 

(AIjvl + DEoD^ 1 ~ (AIjv + *r* T ) - 1 ® B~\ (15) 

which takes the equal sign when A = or B = I, the updating rule 
i ll It can be transformed to 

X (fc+1) = W* T (AI + *W* T ) -1 Y, (16) 

where W = di&g([l/tui, • ■ • , 1/wm]) with Wi = Using the 

same approximation, the last term in ( 113b becomes 

Tr[BDf (XI NL +DS D T )" 1 D i ] 

» Tr [B(*f ® I) [(AIjv + ^W^)" 1 g> B _1 ] (#< g> I)] 

= L&J(Xl N + *W* T ) _1 *i. 
Therefore, from the updating rule of ji d!3t we have 

w (k+l) = J 1 x . B -1 X T + {(w -l + 1 $ T $) -l } ..j ~\ 

Accordingly, the updating rule for B becomes 

M 

B (fe+1) = B/||5||jr, with B = J2 WiX -^ X *- ( 18 ) 

1=1 

We denote the updating rules ([16J {F7j and ([18} by ReSBL-QM. 
With the aid of singular value decomposition, the computational 
complexity of the algorithm is 0(N 2 M) (The effect of L can be 
removed by using the strategy in 171). 

3.2. Estimate the Regularization Parameter A 

To estimate the regularization parameter A, many methods have been 
proposed, such as the modified L-curve method [1|. Here, straight- 
forwardly following the Expectation-Maximization method in (5) 
and using the approximation d!5t , we derive a learning rule for A, 
given by: 

= ^ IIY - *X|]| + A Tr [G(AI + G)- 1 ] . 
where G = *W* T . 



3.3. Theoretical Analysis in the Noiseless Case 

For the noiseless inverse problem Y = <1?X, denote the generating 
sources by X gcn , which is the sparsest solution among all the pos- 
sible solutions. Assume X gcn is full column-rank. Denote the true 
source number (i.e. the true number of nonzero rows in X gen ) by 
Kq. Now we have the following result on the global minimum of 
the cost function {5): 

Theorem 1 In the noiseless case, assuming Kq < {N + 
L) /2, for the cost function (5$ the unique global minimum 7 = 
[71, — i7JVf] produces a source estimate X that equals to X gcn 
irrespective of the estimated Bi, Vi, where X is obtained from 
vec(X T ) = x and x is computed using Ea. il il l. 

The proof is given in [6|. The theorem implies that even if the 
estimated Bi is different from the true Bi, the estimated sources are 
the true sources at the global minimum of the cost function. There- 
fore the estimation error in Bi does not harm the recovery of true 
sources. As a reminder, in deriving our algorithm, we assumed 
Bi = B (Vi) to avoid overfitting. The theorem ensures that this 
strategy does not harm the global minimum property. 

In our work [ 6 1 we have shown that B plays the role of whitening 
sources in the SBL procedure, which can be seen in our algorithm 
as well. This gives us a motivation to improve some state-of-the-art 
reweighted £2 algorithms by whitening the estimated sources in their 
weighting rules and penalties, detailed in the next section. 

4. MODIFY EXISTING REWEIGHTED £ 2 METHODS 

Motivated by the above results and our analysis in [6 |, we can modify 
many reweighted £2 algorithms via replacing the £2 norm of Xj. by 
some suitable function of its Mahalanobis distance. Note that similar 
modifications can be applied on reweighted £\ algorithms. 

The regularized M-FOCUSS (TJ is a typical reweighted £2 al- 
gorithm, which solves a reweighted £2 minimization with weights 
= (IIX^' ll!)^ 2-1 m eacn iteration. It is given by 

x (fc+i) = w (fe) # T (AI + #W (fc) # T )" 1 Y (19) 

W (fc) = diag{[lM fc \... ,1/wf?]} 

W W = (||x«|[^/ 2 -\p e [0,2],Vi (20) 

We can modify the algorithm by changing ((20) to the following one: 

w (k) = ( X W( B W)-i( X «) T )* /a - 1 , pe [0,2],Vi(21) 

The matrix B can be calculated using the learning rule ( 118) . We 
denote the modified algorithm by tMFOCUSS. 

In [4 1 Chartrand and Yin proposed an iterative reweighted £2 
algorithm based on the classic FOCUS S algorithm. Its MMV exten- 
sion (denoted by Iter-L2) changed ( 1201 ) to: 

w f> = (\\^f 2 +e) p/2 -\pe[0,2],\fi (22) 

Their algorithm adopts the strategy: initially use a relatively large e, 
then repeating the process of decreasing e after convergence and re- 
peating the iteration H9) , dramatically improving the recovery abil- 
ity. Similarly, we can modify the weighting d22t to the following 
rule incorporating the temporal correlation of sources: 

W W = (xW(B( fc »)- 1 (X 1 (fc) ) T + e ) p/2 - 1 , P G[0,2],Y23) 

and adopts the same e-decreasing strategy. B is also given by d!8t . 
We denote the modified algorithm by tIter-L2. 



The proposed tMFOCUSS and tIter-L2 have convergence prop- 
erties similar to M-FOCUSS and Iter-L2, respectively. Due to space 
limit we omit theoretical analysis, and instead, provide some repre- 
sentative simulation results in the next section. 



5. EXPERIMENTS 

In our experiments, a dictionary matrix 3? £ R JVxAf was created 
with columns uniformly drawing from the surface of a unit hyper- 
sphere. The source matrix X gcn £ R M x L was randomly generated 
with K nonzero rows of unit norms, whose row locations were ran- 
domly chosen. Amplitudes of the i-th nonzero row were generated as 
an AR(1) process whose AR coefficient was denoted by ftQ. Thus 
Pi indicates the temporal correlation of the i-th source. The mea- 
surement matrix was constructed by Y = <&X gcn + V, where V 
was a zero-mean homoscedastic Gaussian noise matrix with variance 
adjusted to have a desired value of SNR. For each different experi- 
ment setting, we repeated 500 trials and averaged results. The perfor- 
mance measurement was the Failure Rate defined in |7 |, which indi- 
cated the percentage of failed trials in the 500 trials. When noise was 
present, since we could not expect any algorithm to recover X gen ex- 
actly, we classified a trial as a failure trial if the K largest estimated 
row-norms did not align with the support of X gGn . The compared al- 
gorithms included our proposed ReSBL-QM, tMFOCUSS, tIter-L2, 
the reweighted £ 2 SBL in | 2| (denoted by ReSBL-L2), M-FOCUSS 
1 1 1, Iter-L2 presented in Section|4] and Candes' reweighted l\ algo- 
rithm (3) (extended to the MMV case as suggested by 1 2|, denoted by 
Iter-Ll). For tMFOCUSS, M-FOCUSS, and Iter-L2, we set p = 0.8, 
which gave the best performance in our simulations. For Iter-Ll, we 
used 5 iterations. 

In the first experiment we fixed TV = 25, M = 100, L = 3 and 
SNR = 25dB. The number of nonzero sources K varied from 10 to 
16. FigQ](a) shows the results when each ft was uniformly chosen 
from [0, 0.5) at random. Fig[T](b) shows the results when each ft 
was uniformly chosen from [0.5, 1) at random. 

In the second experiment we fixed N = 25, L = 4, K — 12, 
and SNR = 25dB, while M/N varied from 1 to 25. ft (Vi) in Figg] 
(a) and (b) were generated as in Fig[T](a) and (b), respectively. This 
experiment aims to see algorithms' performance in highly underde- 
termined inverse problems, which met in some applications such as 
neuroelectromagnetic source localization. 

From the two experiments we can see that: (a) in all cases, the 
proposed ReSBL-QM has superior performance to other algorithms. 



4 Since in our experiments the measurement vector number is very small 
(L = 3 or 4), generating sources as AR(1) with various AR coefficient values 
is sufficient. 
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Fig. 1. Performance when the nonzero source number changes. 
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Fig. 2. Performance when M/N changes. 



capable to recover more sources and solve more highly underdeter- 
mined inverse problems; (b) without considering temporal correla- 
tion of sources, existing algorithms' performance significantly de- 
grades with increasing correlation; (c) after incorporating the tempo- 
ral structures of sources, the modified algorithms, i.e. tMFOCUSS 
and tIter-L2, have better performance than the original M-FOCUSS 
and Iter-L2, respectively. Also, we noted that our proposed algo- 
rithms are more effective when the norms of sources have no large 
difference (results are not shown here due to space limit). 

6. CONCLUSIONS 

In this paper, we derived an iterative reweighted sparse Bayesian al- 
gorithm exploiting the temporal structure of sources. Its simplified 
variant was also obtained, which has less computational load. Moti- 
vated by our analysis we modified some state-of-the-art reweighted 
£2 algorithms achieving improved performance. This work not only 
provides some effective reweighted algorithms, but also provides a 
strategy to design effective reweighted algorithms enriching current 
algorithms on this topic. 
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